Automatic Prior Art Searching and Patent Encoding at CLEF-IP '10
Authors
Abstract
In the intellectual property field, two tasks are of high relevance: prior art searching and patent classification. Prior art search is fundamental to many strategic issues such as patent granting, freedom to operate, and opposition. Accurate classification of patent documents according to the IPC code system is vital for interoperability between different patent offices and for the prior art search task involved in a patent application procedure. In this paper, we report our experiments with prior art searching and patent classification in the context of the CLEF-IP ’10 evaluation track. In the Prior Art Candidates search task, we substantially improved on last year’s model in our experiments on training data (MAP 0.22), but the official results were well below what we expected (MAP 0.14). Regarding multilingual issues, our simple Google translator strategy achieved a 10% improvement. Nevertheless, we think that the multilingual aspects of CLEF-IP ’10 were less clear than those of CLEF-IP ’09. Finally, exploiting the applicant’s citations led to a 30% improvement, but whether these citations are visible depends on who (the applicant or the examiner) performs the prior art search in the simulated task. This issue needs clarification by the organizers for forthcoming campaigns. In the Classification task, we apply the k-NN algorithm in the categorisation process and explore different retrieval models, ranking combinations, and language features in order to enhance our results. Using multiple collections in the classification process improved the results by 2%. Both the prior art search and classification systems ranked in the top three among the participants.
Similar resources
Report on the CLEF-IP 2011 Experiments: Exploring Patent Summarization
This technical report presents the work carried out for the Prior Art Candidate Search track of CLEF-IP 2011. In this search scenario, the information need is expressed as a patent document (query topic). We compare two methods for estimating a query model from the patent document: summary-based query modeling and description-based query modeling. The former approach utilizes a known text su...
Experiments with Citation Mining and Key-Term Extraction for Prior Art Search
This technical note presents the system built for the IP track of CLEF 2010 based on PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS), the modular search infrastructure initially realized for CLEF IP 2009. We largely reused the system of the previous CLEF IP but at a relatively smaller scale and with the improvement of three main components: • A new citation mining tool based on C...
CLEF-IP 2010: Retrieval Experiments in the Intellectual Property Domain
Over the past decade, research on IR methods for the intellectual property domain has increased. The first efforts to observe how information retrieval is done in the patent domain were made in the series of NIST workshops (see for example [2]). Lately, more workshops and conferences have been dedicated to bringing together IR and IP specialists [3,8]. In 2008, the IRF obtained the agreement to coordina...
CLEF-IP 2011: Retrieval in the Intellectual Property Domain
The patent system is designed to encourage disclosure of new technologies and novel ideas by granting exclusive rights on the use of inventions to their inventors, for a limited period of time. Before a patent can be granted, patent offices around the world perform thorough searches to ensure that no previous similar disclosures were made. In the intellectual property terminology, such kind of se...
University of Santiago de Compostela at CLEF-IP09
In this paper we describe our participation in CLEF-IP 2009 (prior-art search task). This was the first year of the task, and we focused on how to effectively build a prior art query from a patent. Basically, we implemented simple strategies to extract terms from some textual fields of the patent documents and gave preference to title terms. We ran experiments with standard BM25 configurations a...